Add a pycall!() macro #4503

Draft
wants to merge 1 commit into base: main
Conversation

ChayimFriedman2
Contributor

@ChayimFriedman2 commented Aug 30, 2024

This is a draft. The code isn't the most readable in the world and it likely contains bugs. It also won't pass CI, as I coded against the most recent Python and Rust versions. I am opening this PR to gauge opinions.

I chose performance over code safety and conciseness. This choice can be reverted, fully or partially. While this PR is very large, I believe we can get it down to a reasonable size if we are willing to give up some of the performance.

The whole new pycall module is a safety boundary. Things aren't marked unsafe even when they should be. Part of this could be fixed, but part is hard because of macro constraints. Users should not be able to use the stuff defined in this module anyway.

This macro allows you to call Python objects with the most convenient syntax and maximum performance.

The implementation is... complicated. The performance requirements mean that we do a lot of type juggling, including clever tricks.

Fixes #4414.
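
For context, a minimal sketch (an assumed example, not taken from this PR) of what such a call looks like today through PyAnyMethods, next to the kind of invocation the macro is aiming for; func and the argument values are placeholders, and the macro line is draft syntax only:

    use pyo3::prelude::*;
    use pyo3::types::PyDict;

    // Today: keyword arguments require building a PyDict by hand and going
    // through PyAnyMethods::call.
    fn call_today<'py>(py: Python<'py>, func: &Bound<'py, PyAny>) -> PyResult<Bound<'py, PyAny>> {
        let kwargs = PyDict::new_bound(py);
        kwargs.set_item("flag", true)?;
        func.call((1, 2), Some(&kwargs))
    }

    // With the macro, the intent is something close to Python's own call
    // syntax (draft syntax, subject to change):
    //     pycall!(func(1, 2, flag = true))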

Member

@davidhewitt left a comment


Thanks for the PR! As the implementation is huge and is stated to be a draft, I have only skimmed it, so let's focus on the high-level points first and iterate from there:

  • It looks like there are some stylistic choices like "named_argument"=value rather than named_argument=value, and also (*)args and (**)kwargs. Can you elaborate on the choices here: are they draft syntaxes, are they due to limitations in macro parsing, or are they to prevent ambiguities? I think what I'd like to see is an exact mirror of Python call syntax, but we also have the problem of * being the deref operator in Rust, which creates at least one complication.
  • As per the thread in RFC: py_call! macro #4414 there are maintainers among us who rightly question whether a macro is necessary. Can we use builder functions to achieve the same effect (or indeed just regular functions)? If so, how ergonomic can we make those? My intuition is that because kwargs need to be appended to the args slice, and kwarg names separated out into a tuple, the function forms are non-trivial / high overhead. But it would be good to rule that out rather than proceed based on my hunch.
  • The number of combinations of helpers is enormous. Do you think there are natural points where this PR could be broken up so we could review in chunks and possibly extend the design incrementally? We could potentially merge a limited implementation behind an experimental-pycall-macro feature, for example, and extend functionality as we go through the PRs.

@ChayimFriedman2
Contributor Author

ChayimFriedman2 commented Aug 31, 2024

  • It looks like there are some stylistic choices like "named_argument"=value rather than named_argument=value, and also (*)args and (**)kwargs. Can you elaborate on the choices here: are they draft syntaxes, are they due to limitations in macro parsing, or are they to prevent ambiguities? I think what I'd like to see is an exact mirror of Python call syntax, but we also have the problem of * being the deref operator in Rust, which creates at least one complication.

The syntax is named_argument=value, just like in Python. There is another, less efficient alternative, ("named_argument")=value, for when you want to provide a single argument whose name is only known at runtime (that is, the expression within the parentheses is evaluated at runtime instead of being a single identifier string). This is essentially a shortcut for (**)(("named_argument", value),) (i.e. unpacking a single-item tuple of key and value) and has the same performance as that form. Of course, we can drop this option if we like.

The syntax for unpacking includes parentheses because of the deref ambiguity you mentioned. There are alternatives; for example, we could say that if you want to deref you have to parenthesize the value (i.e. (*value)), but I see that as confusing. In general, though, the syntax is kind of a placeholder; I focused more on the implementation (since that was more fun).
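
For comparison, here are the draft forms described above, side by side. This is draft syntax taken from this thread only; none of it is a stable or compilable API today, and func, args, kwargs, name_expr and value are placeholders:

    pycall!(func(name = value))               // keyword name known at compile time
    pycall!(func((name_expr) = value))        // keyword name computed at runtime
    pycall!(func((**)((name_expr, value),)))  // equivalent single-pair unpacking
    pycall!(func((*)args, (**)kwargs))        // iterable / mapping unpacking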

  • As per the thread in RFC: py_call! macro #4414 there are maintainers among us who rightly question whether a macro is necessary. Can we use builder functions to achieve the same effect (or indeed just regular functions)? If so, how ergonomic can we make those? My intuition is that because kwargs need to be appended to the args slice, and kwarg names separated out into a tuple, the function forms are non-trivial / high overhead. But it would be good to rule that out rather than proceed based on my hunch.

A builder is possible, but:

  • It cannot provide the same level of convenience, and
  • It cannot provide the same level of efficiency. The macro relies on being a macro for two things: validating that keyword argument names are disjoint (when all are known at compile time), and autoderef specialization to choose the most performant implementation. The latter can be replaced by multiple functions (but this will be both less convenient and risk people accidentally choosing the less efficient implementation), while the former cannot. That means passing kwargs with all-known-at-compile-time names would have to be penalized (see the vectorcall sketch below for the layout involved).
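
A rough illustration (not from this PR) of the argument layout CPython's vectorcall protocol expects, assuming the non-limited API where pyo3::ffi exposes PyObject_Vectorcall: positional values and keyword values share one contiguous array, and the keyword names go into a separate tuple. A builder has to assemble this layout at runtime, while the macro can emit it directly at the call site:

    use pyo3::ffi;

    /// Calls `callable(pos, flag=kw_value)`, where `kwnames` is the tuple ("flag",).
    /// All pointers are borrowed references; error handling and reference counting
    /// are omitted for brevity.
    unsafe fn vectorcall_example(
        callable: *mut ffi::PyObject,
        pos: *mut ffi::PyObject,
        kw_value: *mut ffi::PyObject,
        kwnames: *mut ffi::PyObject,
    ) -> *mut ffi::PyObject {
        // args = [positional values..., keyword values...]
        let args = [pos, kw_value];
        // nargsf counts only the positional arguments.
        ffi::PyObject_Vectorcall(callable, args.as_ptr(), 1, kwnames)
    }
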
  • The number of combinations of helpers is enormous. Do you think there are natural points where this PR could be broken up so we could review in chunks and possibly extend the design incrementally? We could potentially merge a limited implementation behind an experimental-pycall-macro feature, for example, and extend functionality as we go through the PRs.

Yes. We can provide only the most general combinations (which are unpacking with any iterator or any Python iterable/mapping), and provide more efficient specializations as we go. From a convenience POV, this will already be the same as the final iteration, while from a performance POV we can add things progressively. There is a question of how much to "predict the future" (i.e. whether to add things to a PR when they will only be needed for future combinators), but I guess we can deal with that in time.

As a last thing, it was pointed out to me on Stack Overflow that some of my work is useless ;P While I hope it will become useful with future enhancements to Python, it currently cannot provide any speedup and may even slow things down a bit. So we'll have to decide what to do with that too.

@birkenfeld
Member

Wow, this is a huge amount of work... and a huge PR. 👍

I for one am +1 for a pycall! macro with nice syntax and potential efficiency, but I agree that it should be scaled down initially, and then expanded over time depending on the review bandwidth and user feedback.

As for syntax, I think this hits the sweet spot of being close enough to Python syntax, while resolving the unfortunate clash with * in a fine way.

If it were up to me, I would probably make one change: add a sigil like = to the "computed names" in brackets, i.e. name=value vs. (=name)=value, and the same for obj.method() and obj.(=method)() - to make it obvious that something is being computed here. (name) and name are just a bit too similar, and it might be hard to remember at a glance that the parens are significant.
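
To illustrate the suggestion (again draft syntax only, restating the forms from the comment above; name_expr and method_expr are placeholder expressions):

    pycall!(func(name = value))            // literal keyword name
    pycall!(func((=name_expr) = value))    // computed keyword name, marked by the = sigil
    pycall!(obj.method())                  // literal method name
    pycall!(obj.(=method_expr)())          // computed method name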

@encukou

encukou commented Sep 9, 2024

FWIW, the linked future enhancements to Python discussion is more about consistent naming for the existing CPython call API than about performance or behaviour.
If there's some calling style that CPython could add to make calls from Rust faster, let's try to add it now :)

@ChayimFriedman2
Contributor Author

ChayimFriedman2 commented Sep 9, 2024

@encukou From what I understand it's also about allowing you to call with a tuple and dict without a copy when you are already holding them, and about not providing vectorcall where it's slower (python/cpython#123372). Those are the things I care about.

And maybe also having a faster way to retrieve an unbound method, which would let me check whether a method supports vectorcall.

@mejrs
Member

mejrs commented Sep 9, 2024

Can the "performance" talk please be backed up by benchmarks? It's hard to judge if there actually is any performance benefit or, for that matter, whether the extra performance is worth the added complexity1...

From where I am standing the current complexity¹ of this PR is not worth any extra performance. Even if you managed to pare it down to ~2k LOC it would not just have to be faster, it would have to be significantly faster to justify the added complexity¹.

But back to the PR; I think this PR is not the best way forward:

  • it's huge, and looks like it took a lot of effort. We definitely appreciate your enthusiasm (👍 ) but big PRs out of nowhere are bad for several reasons, mostly because they're hard to review and require a big time/effort investment on your part.
  • it looks like there's some kind of arrayvec impl in here; please just use that crate instead of reinventing the wheel.

Instead, I'd prefer the following approach:

  • create a set of benchmarks of realistic scenarios where you can get a significant benefit by using the pyo3_ffi API rather than PyAnyMethods (see the sketch after this list). If there is a big difference then we'd like to see improvements, and it would motivate more work. If the difference is small or nonexistent, you've just saved yourself a lot of work.
  • change or add to PyAnyMethods to make it better.
  • if the above API gets big and/or confusing, add a macro that forwards calls to it.
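
A minimal sketch of what such a benchmark could look like (an assumed setup, not part of this PR), timing a keyword-argument call through the existing PyAnyMethods::call so that an ffi-based variant can later be measured against the same scenario. It assumes the PyO3 0.22-era API and an embedded interpreter (e.g. pyo3's auto-initialize feature); the "dict" callable is an arbitrary stand-in:

    use criterion::{criterion_group, criterion_main, Criterion};
    use pyo3::prelude::*;
    use pyo3::types::PyDict;

    // Times a simple keyword-argument call through the safe API. Realistic
    // scenarios (bound methods, many kwargs, unpacking) would each get their
    // own benchmark so ffi-based variants can be compared like-for-like.
    fn bench_pyanymethods_call(c: &mut Criterion) {
        Python::with_gil(|py| {
            let callable = py.eval_bound("dict", None, None).unwrap();
            c.bench_function("pyanymethods_call_kwargs", |b| {
                b.iter(|| {
                    let kwargs = PyDict::new_bound(py);
                    kwargs.set_item("a", 1).unwrap();
                    kwargs.set_item("b", 2).unwrap();
                    callable.call((), Some(&kwargs)).unwrap()
                })
            });
        });
    }

    criterion_group!(benches, bench_pyanymethods_call);
    criterion_main!(benches);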

Footnotes

  1. I touched on this slightly in https://github.com/PyO3/pyo3/issues/4414#issuecomment-2267158235. With "complexity", I don't just mean "this PR is a pain to review". Complexity also involves: Can someone (who is not you) confidently make changes or improvements to it? How easy is it to understand as a user of the library? How much documentation is required? We happen to have a lot of documentation, so adding even more should not be done lightly.
